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1.  Introduction 


Monte  Carlo  sampling  strategies  are  commonly  used  to  numerically  evaluate  high¬ 
dimensional  multiple  integrals.  As  is  well  known,  the  associated  estimators  enjc^  the  nice 
property  that  their  convergence  rates  are  relatively  insensitive  to  dimensional  effects.  In 
particular,  the  central  limit  theorem  (CLT)  implies  that  Monte  Carlo  estimators  typically 
converge  at  rate  where  n  is  the  number  of  function  evaluations  made,  and  this  rate 

is  independent  of  the  dimension  d  of  the  integral.  In  contrast,  conventional  deterministic 
integration  schemes  often  suffer'  from  a  convergence  rate  of  the  order  (See,  for 

example,  Davis  and  Rabinowitz  1984). 

Despite  the  high-dimensional  advantages  associated  with  Monte  Carlo  sampling,  it  is 
sometimes  possible  to  improve  upon  its  convergence  characteristics.  We  note  that  Monte 
Carlo  sampling  schemes  typically  construct  estimators  that  involve  sample  means.  A 
sample  mean  is  an  average  over  the  associated  observations  in  which  each  observation  is 
equally  weighted.  (In  the  Monte  Carlo  int<^ation  setting,  an  observation  is  basically  just 
a  frmction  evaluation  computed  at  some  randomly  chosen  point.)  Such  sample  means  use 
only  the  information  that  is  present  in  the  function  evaluations  themselves.  All  information 
about  the  random  locations  at  which  the  function  evaluations  were  computed  is  discarded. 

This  note  concerns  the  improvements  in  estimation  made  possible  by  taking  location 
information  into  account.  Our  main  resiilt  shows  that  variance  is  dramatically  reduced  for 
one-dimensional  integration  problems  when  location  information  is  exploited.  In  particular, 
the  convergence  rate  of  the  resulting  estimator  can  be  improved  from  rate  to  rate  n~^ 
(under  suitable  smoothness  conditions  on  the  integrand).  This  result  suggests  that  it  may 
be  advantageous,  in  future  research,  to  consider  high-dimensional  Monte  Carlo  sampling 
algorithms  that  are  somehow  able  to  take  advantage  of  the  presence  of  the  additional 
function  evaluation  location  information. 
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2.  Description  of  the  main  result 


Suppose  that  we  wish  to  integrate  a  function  /  over  the  unit  interval  [0, 1].  (Note  that 
by  a  suitable  change  of  variable,  all  one-dimensional  integration  problems  may  be  reduced 
to  this  form.)  The  naive  Monte  Carlo  approach  to  the  estimation  of 


a  =  /  f[x)dx 
Jo 

begins  with  generating  n  independent  and  identically  distributed  (iid)  uniform  [0, 1]  ran¬ 
dom  variables  The  function  /  is  then  evaluated  at  each  of  the  n  locations 

Ui,..., Un,  so  that  f{Ui f{Un)  are  obtained,  and  the  integral  or  is  estimated  by  the 
sample  mean 


*  i»l 

Note  that  On  is  a  function  purely  of  /(C^i), . . . , /(^n)>  whereas  the  information  available 
to  the  simvdator  comprises  the  larger  set  {(C^i,/(t^i))  :  »  =  1,...  ,n}.  In  any  case,  the 
CLT  implies  that  if  a*  =  var  f{U\)  <  oo,  then 


n‘/^(an  -  a)  =>  <r/'(0,l) 


(2.1) 


f 


as  n  -t  00,  where  N(0, 1)  denotes  a  normal  random  variable  with  zero  mean  and  unit 
variance.  The  weak  convergence  result  (2.1)  establishes  that  the  convergence  rate  of  On 
to  a  is  of  the  order  of  in  the  number  n  of  function  evaluations  computed.  This 
convergence  rate  is  relatively  slow,  especially  compared  in  the  one-dimensional  setting  to 
the  faster  convergence  rates  of  a  number  of  deterministic  quadrature  fwmulae  that  are 
described,  for  example,  by  Davis  and  Rabinowitz  (1984). 

Becatise  of  the  random  sampling  that  is  b«ng  used,  the  “spacing”  between  the  loca¬ 
tions  Ui,...yUn  will  not  be  regular.  In  fact,  it  is  well  known  (see  Devroye  1982)  that  if 
/Cm  is  the  size  of  the  largest  subinterval  into  which  [0, 1]  is  partitioned  by  t/i, . . . ,  £7^1  then 


n 


a.s. 
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as  n  — >  oo.  (For  sequences  {a„}  and  {6n},  the  notation  On  ~  b„  is  used  if  a„/6n  -+  1 
as  n  —*  00.)  Although  it  may  be  thought  that  the  slow  rate  of  convergence  for  On  is  a 
consequence  of  the  irregular  spacing,  argtunents  given  below  show  that  this  is  not  the  case. 

Given  that  the  integral  a  can  be  computed  via  Riemann  sum  approximations,  it 
makes  sense  to  weight  a  given  function  evaluation  according  to  the  size  of  the  subinterval 
over  which  it  is  assumed  to  approximate  the  ori^nal  function  /.  More  specifically,  let 
Ui(n), . . . ,  Unin)  be  the  order  statistics  of  the  sample  C^i, . . . ,  Un,  so  that  Uk{n)  denotes 
the  kth  largest  observation  in  the  sample  of  n  iid  uniform  random  variables.  We  then 
approximate  /  over  the  interval  [(t/’,_i(n)+t/i(n))/2,  (C/’i(n)+Lri^j(n))/2]  by  the  constant 
value  /(l/’i(n)),  to  arrive  at  the  approximation 

id 

to  the  integral  a,  where  Loin)  =  0,  I<n(n)  =  1>  and  I-i(n)  =  (C^i(n)  +  C/^,+i(n))/2  for 
1  <  X  <  n  -  1.  Since  the  random  variable  Xi(n)  lies  at  the  midpoint  between  l^i(n)  and 
Ui+iin)  for  1  <  X  <  n  —  1,  dn  is  a  randomized  version  of  the  midpoint  quadrature  rule  of 
classical  numerical  integration. 

The  main  result  is  the  following  theorem,  which  is  proved  in  the  Appendix. 

Theorem  1.  Suppose  that  f  is  twice  continuously  dUferentiable  over  (0,1).  Then 


(*;  - 1)  - 


/»>(0) 

2 


(XJ  - 1) 


(2.2) 


as  n  00,  where  Xg  and  Xi  ar^  independent  chi-squared  random  variables  each  with  two 
degrees  of  freedom. 

Theorem  1  shows  that  when  function  evaluation  location  information  is  exploited,  the 
rate  of  convergence  can  be  dramatically  improved.  Specifically,  upon  comparison  of  (2.2) 
to  (2.1),  it  is  evident  that  using  d^  in  place  of  a„  improves  the  covergence  rate  from 
to  n“*. 
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It  is  also  instructive  to  consider  the  convergence  rate  that  arises  when  a  determin¬ 
istic  grid  is  used  to  specify  the  points  at  which  function  evaluations  are  to  be  com¬ 
puted.  In  particiilar,  suppose  that  the  n  function  evaluations  are  performed  at  the  points 
l/(2n),  3/(2n), . . . ,  (2n  —  l)/(2n),  and  consider  the  estimator 


Argxunents  similar  to  those  used  in  the  proof  of  Theorem  1  show  that 

n=(o;  -  a)  ^  (/<■>(!)  -  /<‘'(0))/24.  (2.3) 

Since  .^(xo)  ~  ^(x?)  =  2,  results  (2.2)  and  (2.3)  together  suggest  that  the  leading  term  of 
the  expected  error  for  the  ‘Randomized  midpoint  quadrature  rule”  is  exactly  twelve  times 
that  of  the  error  for  the  deterministic  midpdnt  quadrature  rule.  However,  a  feature  of 
the  estimator  a*  is  that  it  can  not  be  recursively  updated,  since  the  grid  for  sample  size 
n  -I- 1  is  different  from  that  for  sample  size  n.  In  contrast,  the  estimator  dn^i  merely  adds 
a  function  evaluation  at  the  point  Un+i  to  those  already  computed  at  l/i, . . .  ,Un>  Thus, 
the  factor  of  twelve  can  be  viewed  as  a  penalty  paid  to  obtain  an  estimator  that  can  be 
recursively  updated.  Of  course,  the  recursive  update  of  dn^-i  from  dn  would  engender  a 
significant  computational  burden.  The  integer  i  for  which  the  subinterval  [£i-i  (n),Ii(n)] 
contains  Un+i  would  need  to  be  determined,  and  the  cost  of  finding  i  woxild  increase  with 
the  sample  size  n.  Indeed,  for  very  large  n,  the  cost  could  become  prohibitively  expensive. 

This  analysis  in  the  one-dimensional  setting  suggests  that  use  of  function  evaluation 
location  information  is  particularly  appropriate  for  situations  where  the  proposed  sample 
size  n  is  moderate  and  function  evaluations  are  expensive.  Theorem  1  indicates  that  the 
slow  convergence  rate  of  associated  with  niuve  Monte  Carlo  estimation  is  (for  one¬ 
dimensional  problems,  at  least)  not  a  consequence  of  the  irregular  spacing  of  the  function 
evaluations,  but  rather  it  results  from  the  uniform  weights  that  are  rised  in  constructing 
the  sample  mean.  Limit  (2.3)  indicates  that  the  rate  of  convergence  for  a  properly  weighted 


Monte  Carlo  estimator  is  identical  to  that  of  the  corresponding  deterministic  qiiadrature 
rrile  using  a  regtJarly  spaced  grid. 

Another  approach  to  improving  the  rate  of  convergence  of  On  would  be  by  developing  a 
randomized  version  of  Simpson’s  integration  rule.  The  results  observed  here  for  quadrature 
niles  suggest  that  the  error  of  Simpson’s  rule  on  a  regular  deterministic  grid  would  be 
asymptotically  smaller  (at  least  by  a  constant  factor)  than  the  error  of  a  randomized 
version. 

The  primary  purpose  of  this  paper  is  to  observe  that  the  performance  of  Monte  Carlo 
sampling  schemes  can  sometimes  be  substantially  improved  by  taking  advantage  of  function 
evaluation  locaticm  information.  Given  that  deterministic  quadrature  rules  are  likely  to 
be  the  methods  of  choice  for  one-dimensional  integration  problems  (particularly  when 
the  integrand  is  smooth),  the  real  impact  of  this  observation  lies  in  the  possible  use  of 
location  information  for  problems  involving  high-dimensional  integrals,  where  Monte  Carlo 
sampling  is  most  frequently  employed. 

The  theory  presented  here  bears  a  superficial  resemblance  to  a  rotation  sampling 
scheme  proposed  by  Fishman  and  Huang  (1983).  Their  method  involves  using  a  randomly 
translated  regular  grid  with  constant  spacing,  and  it  is  effectively  identical  to  the  well 
known  rectangular  integration  rule,  as  pointed  out  by  Glynn  and  Whitt  (1992).  Our 
result,  in  contrast,  involves  function  evaluations  that  lie  on  a  randomly  generated  and 
irregularly  spaced  grid. 


Conclusions 

We  have  shown  that  the  use  of  function  evaluation  location  information  can  dramat¬ 
ically  improve  the  performance  of  one-dimensional  sampling  schemes.  This  improvement 
arises  via  an  appropriate  reweighting  of  the  function  evaluations  so  that  they  no  longer 
necessarily  receive  identical  weights.  The  results  obtained  here  suggest  that  it  may  be  es¬ 
pecially  beneficial  to  judiciously  incorporate  location  iirformation  when  using  Monte  Carlo 
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sampling  methods  for  estimation  of  high-dimoisional  integrals.^ 

^  The  research  of  the  first  author  was  supported  by  National  Science  Foimdation 
Contract  No.  DMS59204864.  The  research  of  the  second  author  was  supported  by  Army 
Research  Office  Contract  No.  DAAL03-91-G-0101  and  by  National  Science  Foundation 
Contract  No.  DDM91-01580. 


Appendix 


Proof  of  Theorem  1.  Observe  that 

"  ^L,(n) 


a-an  =  ^  /  [/(^)  -  fiUiin))]dx. 

isi  JLi-iin) 

Since  /  is  twice  continuously  differentiable, 

/(i)  -  f{Vi{n))  =  /"'(t^i(n))(i  -  U,(n))  +  i/'»' («.("))(*  - 


(A.1) 


where  4t(n)  between  x  and  Ui(n).  Now, 

where  Ai(n)  =  {Ui{n)  -  Ui~i{n))/2.  By  tising  summation  parts,  the  latter  sum  can  be 
written  ss 


jm7 

-  5/“’(t^.("))A»(")’  -  5  E  /‘”(-rj{>>))Ai(n)’.  (A.2) 

jm7 

where  7>(n)  lies  between  Uj^i{n)  and  Uf{n)  ()  =s  2,.. .  ,n  -  1). 

It  is  possible  at  this  point  to  assume  that  Uj{n)  s  TjfTn-i-i  (j  =  1, . . .  ,n),  where  Tj  is 
the  ;th  jump  time  of  a  Poisson  process  N{t)  having  unit  rate,  since  (C^i(n), . . . ,  Un(n))  is 
distributed  as  (Ti/Tn.t.i,. . .  ,Tn/T,».fi)  for  each  n  >  1.  (See,  for  example,  Devroye  1982). 
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Since  is  continuous  (and  hence  \mifonnly  continuous)  over  [0, 1],  there  exists  an 
integer  k  —  k(€)  for  each  e  >  0  such  that  “  f^^Kv)\  ^  *  whenever  |x  —  y|  <  1/k. 

Then 

jm7 

N(Tn+t/k)-l  t_l  Ar(.T,+i/*)-l 

>{/‘’*(0)-«}  E  Aj(n)>  + 53  {/»>((.• -l)/i)-.}  53  A,(n)’ 

+  {/•’>((*  -  1)/*)  -  «}  E 

i»iv((fc-l)T«+i/fc) 

Let  Tj  =  Tj  —Tj-i  (j  =  +  1)  and  observe  that  Ti,...,Tn+i  are  independent 

exponential  random  variables,  each  having  mean  1.  Since  Tn/n  1  a.8.  as  n  oo,  it  is 
evident  that 

i-iv((i-i)r,+,/fc) 

^  /  n  y  jy(.T.+i/t)  -  Af((.  -  1)W*) 

It.+J  w(ir.+./i)  -  Ar((i  -  i)r.4../t)  8t.+. 

If)  3 

■* 

as  n  00.  Hence, 

lim  n’  52  ^  A  ^(/^’’(*/*)  ”  <) 


and  similarly, 


ns  n’  52  ^  52(/^’^(*/^) + <) 


Since  e  was  arbitrary  and  since 


i  E  w*)  -  /'  =  /“’(I)  -  /“’(O). 

*  ;_A  </  0 
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it  follows  that 


r 

^  -  /^^HO))  a-8-  (^-3) 

J=2 

A  similar  argument  proves  that 

2  fLi(n)  r 

tE/  /“'({.("))(* -f'i("))<**-7(/“’(l)-/“'(0))  (-«••*) 

as  n  -+  00.  Furthermore, 

and 

a... 

as  n  -♦  00.  Whereas 

La/  /W(^,(n))(*- t;i(n))  di  <  n*  sup  |/(*>(z)|Ij(n)* 

I  Jo 

<  n*  sup  0  a.s., 

0<*<1  *n+l 

it  is  evident  that 

n’j[*^‘*"’l/(*)-/(£^i(n))l<fe-nV‘‘>(I'i(n))^2^--/<‘>(0)^  w.  (A.5) 

as  n  -*  00.  A  similar  analysis  holds  for  the  error  over  the  final  subinterval  [Xm-i  (n),l]. 

Since  the  square  of  an  exponential  random  variable  with  mean  1  has  the  chi-squared 
distribution  with  2  degrees  of  freedom,  Theorem  1  follows  upon  combining  (A.1)-(A.5). 
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